Abstract. Broadcasting algorithms are of fundamental importance for distributed systems
engineering. In this paper we revisit the classical and well-studied push protocol for message 
broadcasting. Assuming that initially only one node has some piece of information, at 
each stage every one of the informed nodes chooses randomly and independently one of its 
neighbors and passes the message to it. 

The performance of the push protocol on a fully connected network, where each node 
is joined by a link to every other node, is very well understood. In particular, Frieze and
Grimmett proved that with probability $1 - o(1)$ the push protocol completes the broadcasting
of the message within $(1 \pm \varepsilon)(\log_2 n + \ln n)$ stages, where n is the number of nodes of the
network. However, there are no tight bounds for the broadcast time on networks that are 
significantly sparser than the complete graph. 

In this work we consider random networks on n nodes, where every edge is present with 
probability p, independently of every other edge. We show that if $p \ge \alpha(n) \ln n / n$, where $\alpha(n)$ is
any function that tends to infinity as n grows, then the push protocol broadcasts the message
within $(1 \pm \varepsilon)(\log_2 n + \ln n)$ stages with probability $1 - o(1)$. In other words, in almost every
network of density d such that $d \ge \alpha(n) \ln n$, the push protocol broadcasts a message as fast
as in a fully connected network. This is quite surprising in the sense that the time needed 
remains essentially unaffected by the fact that most of the links are missing. 

1. Introduction 

We consider the problem of spreading information in large random networks with small 
average degree. Randomized broadcasting is among the most fundamental and well-studied
communication primitives in distributed computing, and it also has applications in several other
disciplines, for example in mathematical theories of epidemics. A particularly popular example [3]
is the maintenance of consistency in a distributed database that is replicated at many hundreds
or thousands of sites in a large, heterogeneous network. Efficient broadcasting
algorithms are crucial to ensure that all copies of the database converge quickly and
effectively to the same content.

There is an enormous amount of literature devoted to the theoretical and experimental 
evaluation of broadcasting algorithms on several different underlying networks. Our interest 
in considering random networks is motivated, among other reasons, by P2P (peer-to-peer) 
systems. The idea of using random graphs appears in some "real-life" networks, like the 
popular Gnutella network [8], or the Juxtapose protocol [2], which was originally developed 
by Sun Microsystems. Meanwhile, a considerable amount of work by several research groups 
aimed at designing many diverse networks for P2P systems that resemble properties of random 
graphs, see e.g. [HI [ISJ [TT], and at developing protocols that perform efficiently on random 
(nearly) regular networks [U SJ E] . 

The most relevant properties of P2P networks, and more generally of communication networks,
are high expansion, connectivity, small average degree, and (approximate) regularity
of the degrees of the nodes. The random graph model considered in this paper has these properties.
In particular, we investigate the classical Erdős-Rényi graph $G_{n,p}$, which is obtained
by including each of the possible $\binom{n}{2}$ edges that connect any two out of n labeled vertices
with probability p, independently of all other edges.
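
The definition of the random graph can be turned into a short sampling routine. The following is a minimal sketch, not part of the paper: the function name `sample_gnp`, the 0-based vertex labels and the adjacency-list representation are illustrative assumptions.

```python
import random


def sample_gnp(n, p, seed=None):
    """Sample an Erdos-Renyi graph on vertices 0..n-1 (hypothetical helper).

    Each of the n*(n-1)/2 possible edges is included independently with
    probability p. The graph is returned as a list of adjacency lists.
    """
    rng = random.Random(seed)
    adj = [[] for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            # Independent coin flip for the edge {u, v}.
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
    return adj
```

The double loop takes quadratic time, which is adequate for small experiments; for very sparse graphs on many vertices a skipping-based sampler would be preferable.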

The Push Model. A classical protocol in the context of randomized broadcasting, which is 
also the main topic of our study, is the push model [3 [3] . There, initially some information 
is placed on one of the nodes. In each succeeding stage, every informed node passes the
information to one of its neighbors, which it chooses uniformly at random and independently
of all other choices. The crucial question now is: how long does it take until all nodes have received
the information? There are several advantages of considering a broadcast algorithm like this:
it is simple, local, and scalable, and thus independent of the network topology. Moreover, it
is robust against network and link failures, which makes it highly reliable.
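
The push model described above can be simulated in a few lines. The sketch below is only an illustration under stated assumptions: the graph is given as adjacency lists over vertices 0..n-1, and the names `push_broadcast_time`, `source` and `max_rounds` are hypothetical, not taken from the paper.

```python
import random


def push_broadcast_time(adj, source=0, seed=None, max_rounds=10**6):
    """Simulate the push protocol and return the number of stages used.

    adj[u] is the list of neighbors of u. In every stage each informed
    vertex picks one neighbor uniformly at random and informs it.
    """
    rng = random.Random(seed)
    n = len(adj)
    informed = [False] * n
    informed[source] = True
    informed_list = [source]
    rounds = 0
    while len(informed_list) < n:
        if rounds >= max_rounds:
            raise RuntimeError("broadcast did not finish; graph may be disconnected")
        newly = []
        # Only vertices informed before this stage are allowed to push.
        for u in informed_list:
            if not adj[u]:
                continue
            v = rng.choice(adj[u])
            if not informed[v]:
                informed[v] = True
                newly.append(v)
        informed_list.extend(newly)
        rounds += 1
    return rounds
```

Vertices informed during a stage are collected in `newly` and only start pushing in the following stage, which matches the synchronous description of the protocol.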

In the case where the underlying network is the complete graph on n vertices, Frieze and
Grimmett [7] proved that with high probability (w.h.p.) the push protocol completes the
broadcasting of the message within $(1 \pm \varepsilon)(\log_2 n + \ln n)$ stages. In other words, if a node can
"talk" to any other node in the network, then the broadcast time will be almost surely very
close to $\log_2 n + \ln n$. This bound was later improved by Pittel [13] to $\log_2 n + \ln n + \alpha(n)$,
where $\alpha(n)$ is any function that tends to $\infty$ as $n \to \infty$. Feige et al. considered in [6]
networks that are different from the complete graph. Among other results, they showed that
if the underlying network is a random graph Gn,p, where p > -^^-^j^^, then the message will 
arrive at all nodes with high probability within $\Theta(\ln n)$ stages. Moreover, they also showed
that the protocol is efficient on hypercubes, and derived bounds that hold for arbitrary graphs.
Elsässer and Sauerwald determined in [5] similar bounds for several classes of Cayley graphs,
thus generalizing [6].

Our contribution. Let G = (V, E) be a graph on n vertices, where we will assume that
V = {1, . . . , n}. We define T(G) as the number of stages needed by the push protocol until all
vertices have been informed, if the information is initially placed on node 1. In the remainder
of the paper, we will be using the terms "node" and "vertex" without distinction. Note
that regardless of the underlying network topology $T(G) \ge \log_2 n$, as the number of informed
vertices can at most double in each round. Consequently, all the results mentioned above state
that the push model is, up to multiplicative constants, an asymptotically optimal protocol
for disseminating information.
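
The doubling argument behind this lower bound can be written out in one line. Here $I_t$ denotes the number of informed vertices after t stages, a notation that the paper only introduces later and that is used here as an assumption:

```latex
I_0 = 1, \qquad I_{t+1} \le 2 I_t
\;\Longrightarrow\; I_t \le 2^t ,
\qquad\text{so}\qquad
I_{T(G)} = n \;\Longrightarrow\; 2^{T(G)} \ge n
\;\Longrightarrow\; T(G) \ge \log_2 n .
```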

However, it is not at all well understood how much the structure of the underlying network
affects the performance of the push model. Although, for example, we know from the results
in [6] that on a random graph $G_{n,p}$ the protocol requires with high probability at most $C \ln n$
rounds, for some C > 0, we have a priori no bounds that quantify how much slower (or faster?)
the protocol is compared to the case where the network is the complete graph. In particular,
it is not clear in which way the average degree of the underlying graph influences the speed 
of the protocol. Our main result states that the number of stages is essentially unaffected by 
the density of the underlying graph, thus confirming the robustness and the efficiency of the 
push model: 

Theorem 1.1. Let < a(n) < In^/^ n be any function with the property limjj_>oo a{n) = oo.
Let $p \ge \alpha(n) \ln n / n$. Then w.h.p.

\T{Gn,p) - (log2n + lnn)| < a(n)"^/^lnn. 

In other words, if the average degree of $G_{n,p}$ is slightly larger than $\ln n$, then the broadcast
time of the push model essentially coincides with the broadcast time on the complete graph,
which was shown in [7] to be very close to $\log_2 n + \ln n$. Consequently, the number of stages
needed is not influenced by the fact that most of the links are missing.
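
To get a feeling for the size of the leading term $\log_2 n + \ln n$, it can simply be evaluated. The helper below is an illustrative sketch (the name `predicted_stages` is hypothetical); it only computes the leading term and ignores the lower-order error term of the theorem.

```python
import math


def predicted_stages(n):
    """Leading term log2(n) + ln(n) of the broadcast time (hypothetical helper)."""
    return math.log2(n) + math.log(n)


if __name__ == "__main__":
    for n in (10**3, 10**6, 10**9):
        print(n, round(predicted_stages(n), 2))
```

Since $\ln n = \ln 2 \cdot \log_2 n$, the leading term equals $(1 + \ln 2) \log_2 n$, that is, roughly 1.69 times the trivial lower bound $\log_2 n$.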

To avoid any confusion, we want to note that in Theorem 1.1 the term "w.h.p." refers to
two independent probability spaces: first, the space from which we sample the underlying 
network, and second, the space of the random choices performed by the nodes. 

Proof Ideas & Techniques. Before we proceed with a detailed exposition of our proof, let
us say a few words about the general strategy. Theorem 1.1 is proved by bounding, for
each stage performed by the push model, the number of informed nodes simultaneously from
above and from below. In particular, we show that in the first $(1 - o(1)) \log_2 n$ stages,
the number of informed nodes nearly doubles in each stage. As a result, we are able to show
that after nearly $\log_2 n$ rounds there will be $\varepsilon n$ informed nodes in total. Then things evolve
very fast: after only a small number of stages, the number of nodes having the information
will already be roughly $(1 - \varepsilon) n$. After that, we show that approximately $\ln n$ additional
stages are necessary and sufficient to spread the information to everybody.

The analysis of the last stages is particularly challenging from a technical point of view, as 
the number of informed nodes increases only slowly towards the end of the process. In such 
cases, it is typically difficult to control the deviations of several involved random variables 
from their expectations. To this end, we exploit a modern and powerful tool from probability 
theory called Talagrand's inequality, which - to our knowledge - has not been applied in the 
context of distributed computing problems. We believe that it could be widely applicable 
to the analysis of existing or future randomized protocols with several different degrees of 
dependency. 

Outline. Section 2 introduces the main tools from probability theory that we will use, and
in particular Talagrand's inequality. In Section 3 we collect and prove the basic properties
of $G_{n,p}$ that will be important in the proof of Theorem 1.1, and introduce some necessary
notation that will be used throughout. Finally, Section 4 contains the "core" of the proofs,
where the general strategy given above is converted to a rigorous argument.

2. Preliminaries 

A basic tool that we will use in the following proofs is the Chernoff bound. This provides 
exponentially small bounds for the probability that a binomially distributed random variable 
deviates significantly from its expected value. 

Theorem 2.1 (Chernoff Bounds). Let X be a binomially distributed random variable and let
x > 0. Then

$$\mathbb{P}(|X - \mathbb{E}(X)| \ge x) \le 2 \exp\left( - \frac{x^2}{2(\mathbb{E}(X) + x/3)} \right).$$
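
A quick numerical sanity check can make such a tail bound more concrete. The sketch below assumes the standard form of the Chernoff bound for binomial variables, $2\exp(-x^2 / (2(\mathbb{E}(X) + x/3)))$, and compares it with the exact two-sided binomial tail; the function names are hypothetical.

```python
import math


def chernoff_bound(mean, x):
    """Assumed standard Chernoff bound 2*exp(-x^2 / (2*(mean + x/3)))."""
    return 2.0 * math.exp(-x * x / (2.0 * (mean + x / 3.0)))


def binomial_two_sided_tail(n, p, x):
    """Exact value of P(|X - np| >= x) for X ~ Bin(n, p)."""
    mean = n * p
    return sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n + 1)
        if abs(k - mean) >= x
    )


if __name__ == "__main__":
    n, p = 200, 0.1
    for x in (5, 10, 15):
        print(x, binomial_two_sided_tail(n, p, x), chernoff_bound(n * p, x))
```

For small deviations the bound exceeds 1 and carries no information; it becomes useful once x is noticeably larger than the square root of the expectation.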

A more general tool that we shall apply several times is the inequality by Azuma and
Hoeffding. Intuitively, it provides strong bounds for the probability that a function defined
on a set of independent random variables deviates significantly from its expectation, when
the value of the function is not affected much by small changes in each one of its arguments.

Theorem 2.2 (Azuma-Hoeffding's Inequality). Let $Z_1, \dots, Z_N$ be independent random variables
taking values in the sets $A_1, \dots, A_N$ respectively. Let $A = A_1 \times \dots \times A_N$. Let $f : A \to \mathbb{R}$
be a function and set $X = f(Z_1, \dots, Z_N)$. Assume that there are quantities $c_k$, $k = 1, \dots, N$,
satisfying the following:

a. If $z, z' \in A$ differ only in the kth coordinate, then $|f(z) - f(z')| \le c_k$.

Then, for every x > 0 we have that

(2.1) $\mathbb{P}(|X - \mathbb{E}(X)| \ge x) \le 2 \exp\left( - \frac{x^2}{2 \sum_{k=1}^{N} c_k^2} \right)$.

Note that the above inequality gives meaningful bounds only if the expectation of X is much
larger than $(\sum_{k=1}^{N} c_k^2)^{1/2}$. This condition is unfortunately not always satisfied in our intended
applications. In such cases, we will use an estimate given by Talagrand (see the following
theorem), which gives a much stronger tail bound, provided that an additional assumption
is satisfied. Intuitively, the statement claims that if the value of X is "witnessed" by only a
"small" number of its arguments, then X is sharply concentrated. However, there is a small
caveat: the concentration is not guaranteed to be around the expectation, but instead around
the median of X. (Recall that the median is a number m such that $\mathbb{P}(X < m) \le 1/2$ and
$\mathbb{P}(X > m) \le 1/2$.) As we shall see below, this is not a significant problem as in general the
median is very close to the expected value.

Theorem 2.3 (Talagrand's Inequality). Suppose that the preconditions of Theorem 2.2 are
satisfied. Additionally, assume that there is an increasing function $\psi$ satisfying the following:

b. Let $z \in A$ and $r \in \mathbb{R}$ such that $f(z) \ge r$. Then there exists a set $J \subseteq \{1, \dots, N\}$ with
$\sum_{i \in J} c_i^2 \le \psi(r)$, such that for all $y \in A$ with $y_i = z_i$ when $i \in J$, we have $f(y) \ge r$.

Then, if m is the median of X, for every x > 0 we have

(2.2) $\mathbb{P}(|X - m| \ge x) \le 4 \exp\left( - \frac{x^2}{4 \psi(m + x)} \right)$.

The next statement gives a sufficient condition that ensures that the median is very close 
to the expected value. 

Proposition 2.4 (Example 2.33 in [10]). Let X be a random variable that satisfies the preconditions
of Theorem 2.3 with $\psi(r) \le \lceil r \rceil$. Then

(2.3) $|m - \mathbb{E}(X)| = O(\sqrt{\mathbb{E}(X)})$.



The presentation of the above inequalities is as in [10], where also many applications are
presented. 

3. Properties of Gn,p 

For any graph G with vertex set V let $\Gamma_G(v)$ be the set of neighbors of v in G. Moreover,
for $S, S' \subseteq V$ we will denote by $e_G(S, S')$ the number of edges with precisely one endpoint in
each of S, S'. Finally, for two real numbers a, b we will write $a \pm b$ for the interval of reals
$(a - b, a + b)$, and with slight abuse of notation we will write $X = a \pm b$ to denote $X \in a \pm b$.

Let $\alpha(n) > 0$ be any function with $\lim_{n \to \infty} \alpha(n) = \infty$ and let $p \ge \alpha(n) \ln n / n$. In this section
we collect a few properties of $G_{n,p}$ that we will use in the proof of Theorem 1.1.

Note that for any $S \subseteq V$, the expected number of neighbors of any $v \in V \setminus S$ in S is $p|S|$.
The next lemma says that for all large enough S almost all vertices have roughly the right 
degree in S. 

Lemma 3.1. The random graph Gn,p has w.h.p. for any a{n)~^^'^ < e < 1 the following 
property. For any subset S of its vertices satisfying \S\ > there is a set Xs C V\S that 

contains at most vertices such that 

yv€{V\S)\Xs: ITgUv) nS\ = {l± e)p\S\. 

Proof. Let S be any fixed subset of the vertices such that \S\ > ^^^^ We call a vertex v G V\S 
violating with respect to S, if the number of its neighbors in 5" is > (l + e)^!^! or < (1 — e)^!^!. 
Assume there exist at least t := vertices that are violating, and denote by Xs the set 
consisting of those vertices. 

Note that the expected number of neighbors in S of a vertex is $p|S|$. By applying the
Chernoff bounds, we obtain that the probability that a vertex is violating is for large n at 
most e~^ p\^\/^. Moreover, the events that two distinct vertices are violating are independent, 
which implies that the probability that there are t violating vertices is bounded from above 
by e~^ p|'5'l/4-i^ Hence, as there are (|^|) < n''^' = el*^''"" ways to choose S, the probability 
that there is a set such that there are t violating vertices with respect to it as at most 

exp||5|ln7i ~ j - ^""pII*^! ' 

This, combined with the bound p > can be estimated with plenty of room to spare 

from above by at most e"!*^!'"". The proof is completed by summing this expression up for 
all 151 > □ 

I I — a(n) 

The next statement considers a similar setting as before, with the difference that now S 
might be very small. Here we show that the number of vertices that have many neighbors 
in S is only $o(|S|)$.

Lemma 3.2. For any e > a(n)~^/^, the random graph Gn,p has w.h.p. the following prop- 
erty. For any subset S of its vertices such that \S\ < there is a set Xs containing at 

most \S\e~^a{n)~^ vertices, such that 

yve{V\S)\Xs: \rG„Jv) nS\< epn. 

Proof. The proof is similar to the proof of Lemma 3.1, except that here we have to deal
with small sets S. We give the whole proof for the sake of completeness. We assume that
$|S| \ge \varepsilon p n$, for otherwise the statement holds trivially.

Let S be any fixed subset of the vertices such that |5| < We call a vertex v \ S 

violating with respect to S, if the number of its neighbors in S" is > epn. Assume there exist 
at least t := vertices that are violating, and denote by Xs the set consisting of those 

vertices. 

The expected number of neighbors in 5 of a vertex v G y \ 5 is = o{epn). A 
straightforward application of the Chernoff bounds then implies that the probability that a 
vertex is violating is for large n at most $e^{-\varepsilon p n}$. Hence, as the events that distinct vertices are
violating with respect to S are independent, the probability that there are t such vertices is
at most $e^{-\varepsilon p n t}$.

Note that the number of ways to choose S is (|^|) < (^)''^'- In conclusion, the probability 
that there is an S with t violating vertices is at most 

I^J exp{\S\\nn-epn-t} < l—j exp {|S'| (inn - pa(n) ^)} < f — J . 

The proof then completes by summing this expression up for all epn < |5| < ^j^- D 

Finally, we need the following statement about the distribution of the edges in $G_{n,p}$. The
proof is a straightforward application of Chernoff's bounds, and quite standard in classical
random graph theory. We include a short proof for completeness.

Lemma 3.3. The following holds w.h.p.

$\forall S \subseteq V: \quad e_{G_{n,p}}(S, V \setminus S) = |S|(n - |S|) p \, (1 \pm \sqrt{8}\, \alpha(n)^{-1/2})$.

Proof. It is sufficient to show the statement for S such that $|S| \le n/2$. For any fixed such
S, the quantity $e_{G_{n,p}}(S, V \setminus S)$ is binomially distributed with expectation $|S|(n - |S|)p$. Call
S bad, if $e_{G_{n,p}}(S, V \setminus S)$ deviates from its expected value by more than $\sqrt{4|S|^2(n - |S|) p \ln n}$.
Note that

$$\frac{\sqrt{4|S|^2(n - |S|) p \ln n}}{|S|(n - |S|) p} = \sqrt{\frac{4 \ln n}{np(1 - |S|/n)}} \le \sqrt{\frac{8 \ln n}{np}} \le \sqrt{\frac{8}{\alpha(n)}}.$$

By applying the Chernoff bounds we obtain that the probability that S is bad is with plenty
of room to spare for large n at most

$$2 \exp\left( - \frac{4|S|^2(n - |S|) p \ln n}{3|S|(n - |S|) p} \right) = 2 \exp\left( - \frac{4}{3} |S| \ln n \right).$$

Then, as the number of ways to choose S is at most $n^{|S|}$, we infer by summing over all
$1 \le |S| \le n/2$ that w.h.p. there is no bad set S in $G_{n,p}$. The proof completes readily by using
that $n - |S| \ge n/2$ and the lower bound on p. □

Note that in the special case that $|S| = 1$ in the above lemma, i.e., S contains just a single
vertex v, we obtain that

$|\Gamma_{G_{n,p}}(v)| = e_{G_{n,p}}(S, V \setminus S) = (1 \pm 3\alpha(n)^{-1/2}) pn$.

This fact will become very handy later and we will use it without further reference. 
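
The quantity $e_G(S, V \setminus S)$ controlled by the last lemma is easy to compute for a concrete graph. The sketch below is illustrative only: it assumes adjacency lists over vertices 0..n-1, and the names `cut_edges` and `relative_cut_deviation` are hypothetical.

```python
def cut_edges(adj, S):
    """Number of edges with exactly one endpoint in S."""
    S = set(S)
    return sum(1 for u in S for v in adj[u] if v not in S)


def relative_cut_deviation(adj, S, p):
    """Relative deviation of the cut size from |S| * (n - |S|) * p.

    Assumes 0 < |S| < n and p > 0, so that the expectation is positive.
    """
    n = len(adj)
    s = len(set(S))
    expected = s * (n - s) * p
    return abs(cut_edges(adj, S) - expected) / expected
```

For a single vertex, `cut_edges(adj, [v])` is just the degree of v, which corresponds to the special case noted above.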

4. Broadcasting on Random Graphs 

Let G be any graph with vertex set V and let p > where a(n) < In^^^ n is any 

positive function such that $\lim_{n \to \infty} \alpha(n) = \infty$. Fix

$\varepsilon := \alpha(n)^{-1/2}$.



We say that G is p-typical if it satisfies the following three conditions:

yve{V\S)\Xs: IFgW nS\ = {l± e)p\S\. 



(I) For any 5 C y such that |5| > ^ there is a Xs CV\S such that \Xs\ < ^ and 
(II) For any 5 C T/ such that \S\ < ^ there isaXsCV\S such that \Xs\ < and

yv e {V\S)\Xs : iTciv) nS\< epn. 

(Ill) For ah S C T/ 

eG{S,V\S) = \S\{n- \S\)p (l±V8e 



We will denote by $\mathcal{T}_n(p)$ the set of p-typical graphs on V. Note that Lemmas 3.1-3.3 guarantee
that $G_{n,p}$ is w.h.p. p-typical. Hence, we shall restrict our attention only to graphs in $\mathcal{T}_n(p)$.

Let us denote by $T_1(G)$ the first point in time where at least $\varepsilon n$ vertices are informed and by
$T_2(G)$ the first point in time where at least $(1 - \varepsilon)n$ vertices are informed. Our aim is to give
bounds on T(G) by bounding $T_1(G)$, $T_2(G) - T_1(G)$ and $T(G) - T_2(G)$ uniformly for every
$G \in \mathcal{T}_n(p)$. The following three lemmas do so. In the proofs we will several times assume
that n is sufficiently large so that the claimed inequalities hold, without explicitly mentioning
that.
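
The three phase boundaries can be read off from a recorded run of the protocol. The following sketch is illustrative: it assumes that `informed_counts[t]` holds the number of informed vertices after t stages, and the function name `phase_times` is hypothetical.

```python
def phase_times(informed_counts, n, eps):
    """Return (T1, T2, T) for a recorded trajectory of informed counts.

    T1 is the first stage with at least eps*n informed vertices, T2 the first
    stage with at least (1 - eps)*n, and T the first stage with all n informed.
    An entry is None if the corresponding threshold is never reached.
    """
    def first_time(threshold):
        for t, count in enumerate(informed_counts):
            if count >= threshold:
                return t
        return None

    return first_time(eps * n), first_time((1 - eps) * n), first_time(n)
```

For example, `phase_times([1, 2, 4, 8, 10], n=10, eps=0.3)` evaluates the thresholds 3, 7 and 10 against the recorded counts.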

Lemma 4.1. Uniformly for $G \in \mathcal{T}_n(p)$, with probability $1 - o(1)$ it holds that

$|T_1(G) - \log_2 n| \le 9\sqrt{\varepsilon} \log_2 n$.

Proof. First of all, note that always $T_1(G) \ge \log_2(\varepsilon n)$, as the number of informed nodes at
most doubles in each stage. Hence, we restrict our attention to the proof of the upper bound
for $T_1(G)$.

Let $\mathcal{I}_t$ be the random set of informed vertices after t stages, and set $I_t := |\mathcal{I}_t|$. Note that
our definitions imply that $\mathcal{I}_0 = \{1\}$. We will show that

(4.1) F{lt+i>{2-7V^)It\It<en)>l-\^\,' ^' . . 

I m ' n, otherwise 

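The proof of the lemma then completes by a repeated application of the above inequality. In
particular, either there is a $t < (1 + 8\sqrt{\varepsilon}) \log_2 n$ such that $I_t \ge \varepsilon n$, in which case there is
nothing to show, or, with probability $1 - o(1)$,

$$I_{\lceil (1 + 8\sqrt{\varepsilon}) \log_2 n \rceil} \ge (2 - 7\sqrt{\varepsilon})^{(1 + 8\sqrt{\varepsilon}) \log_2 n} \ge \varepsilon n.$$

So we showed that

$$T_1(G) \le \lceil (1 + 8\sqrt{\varepsilon}) \log_2 n \rceil \le (1 + 9\sqrt{\varepsilon}) \log_2 n.$$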
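
In the remainder we prove (4.1). For every vertex $v \in \mathcal{I}_t$ we define an indicator random
variable $N_v$ that equals 1 if v informs a vertex in $V \setminus \mathcal{I}_t$. Moreover, for every pair of distinct
vertices $v, v' \in \mathcal{I}_t$ let $C_{v,v'}$ be the indicator variable that is equal to 1 if v and v' inform the
same vertex in $V \setminus \mathcal{I}_t$. Finally, denote by $M_t$ the random set of vertices in $V \setminus \mathcal{I}_t$ that will be
informed in stage t + 1 by the vertices in $\mathcal{I}_t$. By simple inclusion-exclusion we obtain that

$$|M_t| \ge \sum_{v \in \mathcal{I}_t} N_v - \sum_{v, v' \in \mathcal{I}_t,\, v \ne v'} C_{v,v'}.$$

Note that

(4.2) $\mathbb{P}(N_v = 1) = \frac{|\Gamma_G(v) \cap (V \setminus \mathcal{I}_t)|}{|\Gamma_G(v)|}$ and $\mathbb{P}(C_{v,v'} = 1) = \frac{|\Gamma_G(v) \cap \Gamma_G(v') \cap (V \setminus \mathcal{I}_t)|}{|\Gamma_G(v)| \, |\Gamma_G(v')|}$.

We shall now show that $|M_t| \ge (1 - 7\sqrt{\varepsilon}) I_t$ holds with the desired probability, which will
complete the proof of (4.1). To achieve this we shall argue differently in the two cases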
It > In^/^ n and It < In^^^ n. Before we proceed, let us make two auxiliary preparations.
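Note that by property (III) of G we obtain for sufficiently large n that

$\forall v \in V: \quad |\Gamma_G(v)| = (1 \pm 3\varepsilon) pn$.

This, together with (4.2), implies with a simple double counting argument that

$$\sum_{v, v' \in \mathcal{I}_t,\, v \ne v'} \mathbb{E}(C_{v,v'}) = \sum_{v, v' \in \mathcal{I}_t,\, v \ne v'} \frac{|\Gamma_G(v) \cap \Gamma_G(v') \cap (V \setminus \mathcal{I}_t)|}{(1 \pm 7\varepsilon)(pn)^2} = \frac{(1 \pm 8\varepsilon)}{(pn)^2} \sum_{u \in V \setminus \mathcal{I}_t} \binom{|\Gamma_G(u) \cap \mathcal{I}_t|}{2}.$$

We will use these facts in the remainder without further reference.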

First, suppose that It < In^^^n. Note that for each vertex v ^ It at least |rG'(?;)| — In^^^n of 
the edges that are adjacent to it are directed to vertices outside If This implies that 

P(jv„,i),|rcWn(ni.)l in'/^n 1 

^ ' |rG(»)l " (1 - "lejpn " 2 

Therefore, with probability at least 1 — ^ ln~^/^ n, all vertices in It inform a vertex that lies 
outside It, i.e., X^^gx^ Ny = It- However, there is still the possibility that two vertices in It 
inform the same vertex, thus creating a conflict. The probability that such a conflict occurs 
is for large n smaller than 



v,v' GlttVj^v' u£V\Ji 

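Note that $0 \le |\Gamma_G(u) \cap \mathcal{I}_t| \le I_t$. Moreover, property (III) in the definition of $\mathcal{T}_n(p)$ implies
that

$$\sum_{u \in V \setminus \mathcal{I}_t} |\Gamma_G(u) \cap \mathcal{I}_t| = e_G(\mathcal{I}_t, V \setminus \mathcal{I}_t) \le 2 I_t pn.$$

Under these conditions, as the sum of the binomial coefficients above is a convex function, it is
bounded from above when we set $|\Gamma_G(u) \cap \mathcal{I}_t| = I_t$ for 2pn choices of u, and $|\Gamma_G(u) \cap \mathcal{I}_t| = 0$
otherwise. Hence, we obtain for large n that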

E E(C7„,„,)<7r^-2pn-/,^<i ln-V2„. 

So, with probability at least 1 — ^ ln~^/^ \ ln~^^^ n>\ — In^^^^ n we have that \Mt\ = It-, 
which completes the proof for the case It < In^/^ n. 

Finally, we consider the case It > In^/^ n. We wifl first give tight bounds on the expectation of 
$|M_t|$, and then apply the Azuma-Hoeffding inequality to show that $|M_t|$ is sufficiently sharply
concentrated around $\mathbb{E}(|M_t|)$. By using (4.2) we obtain with plenty of room to spare for
large n that



(4.3) 



E f V iV.U V = (1^4.)e.(T,y\T.) (zzz) 

Moreover, recall that

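(4.4) $\sum_{v, v' \in \mathcal{I}_t,\, v \ne v'} \mathbb{E}(C_{v,v'}) = \frac{(1 \pm 8\varepsilon)}{(pn)^2} \sum_{u \in V \setminus \mathcal{I}_t} \binom{|\Gamma_G(u) \cap \mathcal{I}_t|}{2}$.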

We are going to estimate the last sum from above as follows. As G G %i{p) we may infer the 
following. 

• If -^t < ^j^, then, by (II), there is X C V \ It such that \X\ < and 

^v^{y\Xt)\X : \Vg(v) r\Xt\ < epn. 

• If ^ < < en = then, by (I), there is A" C F \ Jt such that \X\ < and 

Vt; G (y \Tt) \ ;f : \TG{v)rMt\ < (1 + e)plt < 2epn. 

So, in both cases we have for all $v \in (V \setminus \mathcal{I}_t) \setminus X$ that $|\Gamma_G(v) \cap \mathcal{I}_t| \le 2\varepsilon pn$, and $|X| \le \sqrt{\varepsilon} I_t$.
Moreover, by exploiting property (III) of G we obtain that for all $v \in X$ it holds $|\Gamma_G(v) \cap \mathcal{I}_t| \le
2pn$. Using this, we can bound from above the sum in (4.4) by splitting it into two parts as
follows:



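$$\sum_{u \in V \setminus \mathcal{I}_t} |\Gamma_G(u) \cap \mathcal{I}_t|^2 \le \sum_{u \in (V \setminus \mathcal{I}_t) \setminus X} |\Gamma_G(u) \cap \mathcal{I}_t|^2 + |X| (2pn)^2 \le \sum_{u \in (V \setminus \mathcal{I}_t) \setminus X} |\Gamma_G(u) \cap \mathcal{I}_t|^2 + \sqrt{\varepsilon} I_t (2pn)^2.$$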

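Note that $0 \le |\Gamma_G(u) \cap \mathcal{I}_t| \le 2\varepsilon pn$ for every $u \in (V \setminus \mathcal{I}_t) \setminus X$. Moreover, it is easily seen that
$\sum_{u \in V \setminus \mathcal{I}_t} |\Gamma_G(u) \cap \mathcal{I}_t| = e_G(\mathcal{I}_t, V \setminus \mathcal{I}_t)$. By the convexity of $x^2$, the sum in the expression above
is bounded from above if we choose $|\Gamma_G(u) \cap \mathcal{I}_t| = 2\varepsilon pn$ for $e_G(\mathcal{I}_t, V \setminus \mathcal{I}_t)/(2\varepsilon pn)$ different u's,
and $|\Gamma_G(u) \cap \mathcal{I}_t| = 0$ otherwise. We obtain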

Y ira(n) nT,P < I^i-lM2epnf ^ ^ \^ep-r?U. 

^-^ epn 2 

u&V\It 

By plugging this into ()4.4p we obtain that ^„ v'&Jt vi^v' ^ ipv^v'^ < Sy^/t. Finally, combined 
with (j4.3p this gives with lots of room to spare that. 

lE(IMI) > {\-^^^e)It. 

To complete the proof we will bound the probability that $|M_t| < I_t(1 - 7\sqrt{\varepsilon})$ by using
Azuma-Hoeffding's inequality. Note that $|M_t|$ can change by at most 1, if we modify one of the choices
made by some vertex in $\mathcal{I}_t$. So, by applying Theorem 2.2 with $c_k = 1$ and $N = I_t$ we obtain

P(|M| </t(l-7Vi)) <IP(|M| <E(|M|)- v^/t) <e-i^°'^'^ 

thus concluding the proof for the case It > In^^^ n. □ 

In the next lemma we will consider the "intermediate" phase of the push model between
$T_1(G)$ and $T_2(G)$ for $G \in \mathcal{T}_n(p)$. Our general strategy is to bound the number $N_t$ of
vertices that get informed in the current stage t from below. For this, we first estimate $\mathbb{E}(N_t)$
and then we use concentration inequalities (Theorem 2.2) to show that with sufficiently high
probability $N_t$ is very close to $\mathbb{E}(N_t)$.

Lemma 4.2. Uniformly for all $G \in \mathcal{T}_n(p)$, with probability $1 - o(1)$ it holds that

$T_2(G) - T_1(G) \le 9 \varepsilon^{-1} \ln \varepsilon^{-1}$,

and there are at least $\varepsilon n/(2e)$ uninformed vertices at $T_2(G)$.

Proof. Let $\mathcal{I}_t$ be the random set of informed vertices after t stages, and set $I_t := |\mathcal{I}_t|$. We will
show that for $T_1(G) \le t < T_2(G)$



%(G)+rfel > It^{G) • (l + > e^e^'^^ > (1 - 



(4.5) It+i >It[l + ^ 

with probability at least 1 — g-^^^/s^ Lgf^ abbreviate b = 8e~^ lne~^. To see that this is 
sufficient, note that if "T2(G) — Ti (G) < 6" , then there is nothing to prove. On the other hand, 
if "T2(G) - Ti{G) > 6", then with (conditional) probability at least (1 - e-^'"/^)'' = 1 - o(l), 
for [6] consecutive steps after T\{G) the recursion (j4.5|) holds. In turn, this implies with 
1 + X > e^/^, which is valid for small enough x > 0, that 

1, 

Therefore Ix^{G)+\b'\ > ^(1 ~ ^)) from which we obtain with plenty of room to spare that, say, 
T2{G)-Ti{G) < 5 + 1 < 9e-Mne-i. 

Now we turn to the proof of (4.5). Let t be such that $T_1(G) \le t < T_2(G)$, and denote by $M_t$
the set of vertices in $V \setminus \mathcal{I}_t$ that will be informed by the vertices in $\mathcal{I}_t$ in stage t + 1. Moreover,
write $N_t := |M_t|$. We will show that $N_t$ is not much smaller than its expected value. But first
let us calculate $\mathbb{E}(N_t)$. The definition of the push model implies that the probability that
any $v \in V \setminus \mathcal{I}_t$ does not belong to $M_t$ is precisely

$$\prod_{u \in \Gamma_G(v) \cap \mathcal{I}_t} \left( 1 - \frac{1}{|\Gamma_G(u)|} \right).$$

Next we make use of property (I) in the definition of Tn{p): All vertices in V\It, apart from 
an exceptional set X = Xt £ V \ Zt that contains at most 8n/lnn vertices, have (1 ± e)plt 
neighbors in It. Using this and the above fact we may write 



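(4.6) $\mathbb{E}(N_t) = \sum_{v \in V \setminus \mathcal{I}_t} \left( 1 - \prod_{u \in \Gamma_G(v) \cap \mathcal{I}_t} \left( 1 - \frac{1}{|\Gamma_G(u)|} \right) \right)$.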



Next we derive tight bounds for the product above. Firstly, observe that property (III) implies
that for all $u \in \mathcal{I}_t$

(4.7) $|\Gamma_G(u)| = np(1 \pm 3\varepsilon)$.

Also, the definition of X implies for $v \in (V \setminus \mathcal{I}_t) \setminus X$

(4.8) $|\Gamma_G(v) \cap \mathcal{I}_t| = (1 \pm \varepsilon) p I_t$.

Recall that for x > small enough we have <1 — x<e^. So the bounds in ()4.7p

and (j4.8p imply that for v £ {V \ It) \X we have 

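As $|(V \setminus \mathcal{I}_t) \setminus X| = (n - I_t)(1 \pm \varepsilon)$, by substituting the above estimate into (4.6) we obtain

(4.9) $\mathbb{E}(N_t) = n \left( 1 - \frac{I_t}{n} \right) \left( 1 - e^{-I_t/n} \right) (1 + O(\varepsilon))$.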

We will bound the probability that $|N_t - \mathbb{E}(N_t)| \ge \varepsilon \mathbb{E}(N_t)$ using the Azuma-Hoeffding
inequality. Firstly, note that as $\varepsilon \le I_t/n \le 1 - \varepsilon$, we have $\mathbb{E}(N_t) \ge \varepsilon^2 n/2$, for n sufficiently
large. Moreover, if we change only one of the random choices of the vertices in $\mathcal{I}_t$, then $N_t$
changes by at most 1. Thus, a simple application of Theorem 2.2 with $c_k = 1$ and $N = I_t$
yields

P(|iVt-E(iVj)| >eE(iVt))<2exp -f^ <2exp' 



So, for n sufficiently large, with plenty of room to spare we obtain that 
(4.10) $N_t = n \left( 1 - \frac{I_t}{n} \right) \left( 1 - e^{-I_t/n} \right) (1 \pm \sqrt{\varepsilon})$

with probability 2exp This identity enables us to write a recursive formula concern- 

ing the evolution of the number of informed vertices. Recall that for all < x < 1, we have 
$1 - e^{-x} \ge x/2$. Thus (4.10) implies that

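(4.11) $I_{t+1} \ge I_t + n \left( 1 - \frac{I_t}{n} \right) \frac{I_t}{2n} (1 - \sqrt{\varepsilon}) = I_t \left( 1 + \frac{1}{2} \left( 1 - \frac{I_t}{n} \right) (1 - \sqrt{\varepsilon}) \right)$.

Since $I_t < (1 - \varepsilon)n$, it follows that for n large enough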

^(l-|)(l-Vi)^(l-Vi)S^. 
By substituting this bound into (4.11) we obtain (4.5).

What remains is to show the second statement of the lemma. This follows readily from (4.10).
Indeed, if $U_t$ denotes the number of uninformed vertices after t stages, then observe first that
$n(1 - I_t/n) = U_t$. So, for n large enough



innj , en 

UT,iG) = UT,iG)-i-NT,iG)-i > f/T2(G)-ie-'(^^''''"'^/"(l-e^/F)>-. 

□ 



Finally, we proceed by bounding $T(G) - T_2(G)$, for $G \in \mathcal{T}_n(p)$. Let us denote by $\mathcal{I}_t$ the
set of informed vertices after stage t. Recall that the main strategy in the previous argument
was to show that the number $N_t$ of vertices that become informed by $\mathcal{I}_t$ in stage t + 1 is close
to its expected value. To achieve this, we exploited the fact that in G, except for a set X
of size < all vertices have the "right" degree in It. This argument is unfortunately not 
applicable in the proof of the next lemma: for $t \ge T_2(G)$, the set $V \setminus \mathcal{I}_t$ of not yet informed
vertices can become much smaller than X, which makes our bounds useless. So we need
to argue somewhat differently. An additional difficulty is that we are not able to apply the
Azuma-Hoeffding inequality in a meaningful way. Note that for $t \ge T_2(G)$ the quantity $I_t$
is already of linear order, but the number $N_t$ of newly informed vertices at stage t + 1 may
become very small. In this case, the Azuma-Hoeffding inequality gives a trivial upper bound,
and thus there is a need for a stronger concentration inequality.
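
The gap between the two inequalities in this regime can be illustrated numerically. The sketch below is not from the paper: it assumes the two-sided forms $2\exp(-x^2/(2N))$ for Azuma-Hoeffding with all $c_k = 1$ and $4\exp(-x^2/(4\lceil m + x \rceil))$ for Talagrand with $\psi(r) = \lceil r \rceil$, and the function names and sample numbers are hypothetical.

```python
import math


def azuma_bound(x, num_choices):
    """Assumed two-sided Azuma-Hoeffding bound with c_k = 1, capped at 1."""
    return min(1.0, 2.0 * math.exp(-x * x / (2.0 * num_choices)))


def talagrand_bound(x, median):
    """Assumed two-sided Talagrand bound with psi(r) = ceil(r), capped at 1."""
    return min(1.0, 4.0 * math.exp(-x * x / (4.0 * math.ceil(median + x))))


if __name__ == "__main__":
    # Many informed vertices make random choices, but only few new
    # vertices are expected to be informed in the stage.
    num_choices, median, x = 10**6, 100, 50
    print("Azuma-Hoeffding:", azuma_bound(x, num_choices))
    print("Talagrand:      ", talagrand_bound(x, median))
```

With these sample values the Azuma-Hoeffding exponent is tiny, so the capped bound is trivial, while the Talagrand bound depends only on the median and the deviation and stays well below 1.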

Lemma 4.3. Uniformly for all $G \in \mathcal{T}_n(p)$, with probability $1 - o(1)$

|(r(G) -r2(G)) -lnn| <e^^^lnn. 

Proof. We will split the interval between T2{G) and T(G) into two subintervals. In particular, 
let T'{G) be the first time after T2{G) where at most In^^'^ n uninformed vertices remain. We 
will give separate bounds for $T'(G) - T_2(G)$ and $T(G) - T'(G)$. For the remainder, let $\mathcal{I}_t$ be
the random set of informed vertices after t stages, and set $I_t := |\mathcal{I}_t|$.

Let us start with the latter case, as it is the easier among the two. Let t > T'[G). Since 
np > a(n) Inn, it follows from property (III) that for every v £ y \ Xj we have for n large 
enough iTciv) n It\ > np{l — 3e) — In^/^ n > np{l — Ae). So, the probability that a given 
uninformed vertex remains uninformed in the next stage is for large n at most 

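$$\left( 1 - \frac{1}{np(1 + 3\varepsilon)} \right)^{np(1 - 4\varepsilon)} \le e^{-\frac{1 - 4\varepsilon}{1 + 3\varepsilon}} \le \frac{2}{e}.$$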

Therefore, the probability that such a vertex remains uninformed for at least In ' n steps after 
T'{G) is at most This implies that the expected number of vertices that remain 

uninformed for at least In^/^ n stages after T'(G) is at most In^/^ n(2/e)''^''''' " < " = 

o(l). That is, with probability at least 1 - (2/e)''"''" we have T{G) - T'{G) < In^/^ 

The bound on $T'(G) - T_2(G)$ is significantly more complex. Let $U_t$ denote the number of
vertices that are still uninformed after the tth stage. We will show that if t is such that
Ut > In^/^ n, then 

(4.12) $U_{t+1} = U_t e^{-1} (1 \pm 50\sqrt{\varepsilon})$,

with probability at least i _ g-^i'^''" "/lo. So if r'(G) - T2(G) > \lnn + 55^/elnn] =: 6i, then 
with conditional probability at least (1 — e~^^^ ^ n/io-jfei+i _ _ ^^-^^^ have 

UT,^G)+b,+i < UT.ioe-"'-' (1 + 50^^)'^+' . 

For large n 

(1 + 50^^)'^+' <e^5v^''^". 
Also, U^^f^Q) < en, which together with the above facts implies that U'p2{G)+bi+i ^ So, we 
may conclude that T'(G) < r2(G) + 6i + 1. 

Similarly, if we assume that T'(G) — T2(G) < [In n — 55-y/e In nj =: 62; then with conditional 
probability at least (1 - e"^''^''" '^/10)''2 = 1 - o(l) we have 

C/t,(g)+6. > UT,(G)e-'' (1 - 50^^)'^ . 

A similar calculation as above, and the fact Uj-^i^q^ > which is guaranteed by Lemma 14.21 
to hold with probability 1 — o(l), shows that 



C/r,(G)+6, > ee^^'^^ » In^^ 



n. 

Thus, $|T'(G) - T_2(G) - \ln n| \le 55\sqrt{\varepsilon} \ln n + 2$, which concludes the proof of the lemma.

It remains to show (4.12). As an auxiliary preparation we will show that "most" vertices
in $V \setminus \mathcal{I}_t$ have the "right" degree in $\mathcal{I}_t$, by arguing that if this was not the case, then there
would be a significant deviation in the number of edges between $\mathcal{I}_t$ and $V \setminus \mathcal{I}_t$. More precisely,
let

$X = \{ v \in V \setminus \mathcal{I}_t : |\Gamma_G(v) \cap \mathcal{I}_t| \le (1 - 3\sqrt{\varepsilon}) pn \}$.

In the sequel we argue that

(4.13) $|X| \le 3\sqrt{\varepsilon} (n - I_t)$.

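Indeed, as we assumed that $G \in \mathcal{T}_n(p)$, property (III) guarantees that $e_G(\mathcal{I}_t, V \setminus \mathcal{I}_t) \ge
I_t(n - I_t)p(1 - 3\varepsilon)$. Moreover, property (III) implies that every vertex v has degree at most
$(1 + 3\varepsilon)pn$. Therefore

$$e_G(\mathcal{I}_t, V \setminus \mathcal{I}_t) \le |X|(1 - 3\sqrt{\varepsilon})pn + (1 + 3\varepsilon)(n - I_t - |X|)pn.$$

By putting the upper and the lower bounds together we obtain

$$I_t(n - I_t)p(1 - 3\varepsilon) \le -3|X|pn(\sqrt{\varepsilon} + \varepsilon) + (1 + 3\varepsilon)(n - I_t)pn,$$

which implies with $I_t \ge (1 - \varepsilon)n$ that

$$1 - 4\varepsilon \le -3 \frac{|X|}{n - I_t} (\sqrt{\varepsilon} + \varepsilon) + (1 + 3\varepsilon).$$

An elementary calculation shows that the claim (4.13) holds.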
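
The elementary calculation can be spelled out as follows; this is a sketch of one way to finish the argument, starting from the inequality $1 - 4\varepsilon \le -3 \frac{|X|}{n - I_t}(\sqrt{\varepsilon} + \varepsilon) + (1 + 3\varepsilon)$:

```latex
3\,\frac{|X|}{n - I_t}\,\bigl(\sqrt{\varepsilon} + \varepsilon\bigr)
  \;\le\; (1 + 3\varepsilon) - (1 - 4\varepsilon) \;=\; 7\varepsilon
\quad\Longrightarrow\quad
\frac{|X|}{n - I_t}
  \;\le\; \frac{7\varepsilon}{3\sqrt{\varepsilon}\,(1 + \sqrt{\varepsilon})}
  \;\le\; \frac{7}{3}\sqrt{\varepsilon}
  \;\le\; 3\sqrt{\varepsilon}.
```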

Now let $v \in (V \setminus \mathcal{I}_t) \setminus X$. The probability that v becomes informed in the next stage is

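Denote by $M_t$ the set of vertices in $V \setminus \mathcal{I}_t$ that will be informed by the vertices in $\mathcal{I}_t$ in stage
t + 1. Moreover, write $N_t = |M_t|$. So, by linearity of expectation, for n large enough we obtain

(4.14) $\mathbb{E}(N_t) = (n - I_t) \left( 1 - \frac{1}{e} \right) (1 \pm 7\sqrt{\varepsilon}) \pm 3\sqrt{\varepsilon}(n - I_t) = (n - I_t) \left( 1 - \frac{1}{e} \right) (1 \pm 14\sqrt{\varepsilon})$.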



Next we will show that $N_t$ is with sufficiently high probability close to its expected value. Note
that the Azuma-Hoeffding inequality does not give any meaningful bounds, as the number
of the independent random variables is $I_t \ge (1 - \varepsilon)n$, while the expected value of $N_t$ is
proportional only to $n - I_t$. The latter will eventually become so small that the exponent
in the Azuma-Hoeffding inequality is o(1), thus yielding a trivial bound. To bypass this
problem, we will use Talagrand's inequality (Theorem 2.3). Note first that the bounded
differences condition is satisfied, that is, changing one random choice can change $N_t$ by at
most 1. Regarding the second condition, note that if $N_t = r$, then there must be at least r
vertices in $\mathcal{I}_t$ that have informed the vertices in $M_t$. Therefore, we may take $\psi(r) = \lceil r \rceil$ and
with $m(N_t)$ denoting the median of $N_t$, we deduce for any x > 0 that

(4.15) $\mathbb{P}(|N_t - m(N_t)| \ge x) \le 4 \exp\left( - \frac{x^2}{4(m(N_t) + x)} \right) \le 4 \exp\left( - \frac{x^2}{4(2\mathbb{E}(N_t) + x)} \right)$,
where in the last inequality we have used the fact that $\mathbb{E}(N_t) \ge m(N_t) \mathbb{P}(N_t \ge m(N_t)) \ge
m(N_t)/2$, which implies that $m(N_t) \le 2\mathbb{E}(N_t)$. However, we need to argue about the distance
of $m(N_t)$ from $\mathbb{E}(N_t)$. We will use (2.3). The triangle inequality yields:

$$|N_t - \mathbb{E}(N_t)| = |N_t - m(N_t) + m(N_t) - \mathbb{E}(N_t)| \le |N_t - m(N_t)| + |m(N_t) - \mathbb{E}(N_t)| = |N_t - m(N_t)| + O(\sqrt{\mathbb{E}(N_t)}).$$

Since a(n) < In^^^n, we have sf&^Nt) = o{^E{Nt)). Therefore, for sufficiently large n 

(4.16) $|N_t - \mathbb{E}(N_t)| \ge x \implies |N_t - m(N_t)| \ge x - \sqrt{\varepsilon}\, \mathbb{E}(N_t)$.

Therefore, using (4.16) in (4.15) with $x = \sqrt{\varepsilon}\, \mathbb{E}(N_t)$ we obtain

(4.17) ¥i\Nt-E{Nt)\ > 2V^E(iVi)) < 4exp (- ^(f^j^^ ) • 

Since n — It > In^^/^n, by (j4.14p we obtain that, say, E{Nt) > ^ So, for large n, the 



\Nt - E{Nt)\ > 2^E{Nt)) < exp 



:lnl/2 



n 



40 / 

By putting everything together we obtain that with probability at least 1 — e~^^^ ^ "/^'^ 

$N_t = (n - I_t) \left( 1 - \frac{1}{e} \right) (1 \pm 16\sqrt{\varepsilon})$.

So there remain (very generously) $(n - I_t) e^{-1} (1 \pm 60\sqrt{\varepsilon})$ vertices uninformed in $V \setminus \mathcal{I}_t$. This
completes the proof of (4.12). □

Finally, note that the bounds obtained in Lemmas 4.1-4.3 imply Theorem 1.1, thus
concluding our proof.